Supervised Learning of Term Similarities

نویسندگان

  • Irena Spasic
  • Goran Nenadic
  • Kostas Manios
  • Sophia Ananiadou
چکیده

In this paper we present a method for the automatic discovery and tuning of term similarities. The method is based on the automatic extraction of significant patterns in which terms tend to appear. Beside that, we use lexical and functional similarities between terms to define a hybrid similarity measure as a linear combination of the three similarities. We then present a genetic algorithm approach to supervised learning of parameters that are used in this linear combination. We used a domain specific ontology to evaluate the generated similarity measures and set the direction of their convergence. The approach has been tested and evaluated in the domain of molecular biology.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Bidirectional Label Propagation over Graphs

Graph-Based label propagation algorithms are popular in the state-of-the-art semi-supervised learning research. The key idea underlying this algorithmic family is to enforce labeling consistency between any two examples with a positive similarity. However, negative similarities or dissimilarities are equivalently valuable in practice. To this end, we simultaneously leverage similarities and dis...

متن کامل

Efficient Similarity Derived from Kernel-Based Transition Probability

Semi-supervised learning effectively integrates labeled and unlabeled samples for classification, and most of the methods are founded on the pair-wise similarities between the samples. In this paper, we propose methods to construct similarities from the probabilistic viewpoint, whilst the similarities have so far been formulated in a heuristic manner such as by k-NN. We first propose the kernel...

متن کامل

A Note on Semi-Supervised Learning using Markov Random Fields

This paper describes conditional-probability training of Markov random fields using combinations of labeled and unlabeled data. We capture the similarities between instances learning the appropriate distance metric from the data. The likelihood model and several training procedures are presented.

متن کامل

Semi-Supervised Learning Based Prediction of Musculoskeletal Disorder Risk

This study explores a semi-supervised classification approach using random forest as a base classifier to classify the low-back disorders (LBDs) risk associated with the industrial jobs. Semi-supervised classification approach uses unlabeled data together with the small number of labelled data to create a better classifier. The results obtained by the proposed approach are compared with those o...

متن کامل

Graph-based Learning for Statistical Machine Translation

Current phrase-based statistical machine translation systems process each test sentence in isolation and do not enforce global consistency constraints, even though the test data is often internally consistent with respect to topic or style. We propose a new consistency model for machine translation in the form of a graph-based semi-supervised learning algorithm that exploits similarities betwee...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2002